ABSTRACT. We give a proof of the $A_2$ conjecture in geometrically doubling metric spaces (GDMS), i.e. metric spaces where one can fit no more than a fixed number of disjoint balls of radius $r$ in a ball of radius $2r$. Our proof consists of three main parts: a construction of a random "dyadic" lattice in a metric space; a clever averaging trick from [5], which decomposes a "hard" part of a Calderon-Zygmund operator into dyadic shifts (adjusted to the metric setting); and the estimates for these dyadic shifts, made in [16] and later in [19].
1. INTRODUCTION



Recall that in [17] the following was proved.



Theorem 1.1. If T is an arbitrary operator with a Calderon-Zygmund kernel, then 



\[
\|T\|_{L^2(w)\to L^{2,\infty}(w)} + \|T'\|_{L^2(w^{-1})\to L^{2,\infty}(w^{-1})} \le 2\,\|T\|_{L^2(w)\to L^2(w)}
\le C\bigl([w]_{A_2} + \|T\|_{L^2(w)\to L^{2,\infty}(w)} + \|T'\|_{L^2(w^{-1})\to L^{2,\infty}(w^{-1})}\bigr).
\]

By $T'$ we denote the adjoint operator. Here, of course, only the right inequality is interesting, and it is unexpected too: the weak and strong norms of any operator with a Calderon-Zygmund kernel turn out to be equivalent up to the additive term $[w]_{A_2}$. From this we obtained in [17] the following result, which holds for any Calderon-Zygmund operator.

Theorem 1.2. $\|T\|_{L^2(w)\to L^2(w)} \le C\,[w]_{A_2}\log(1 + [w]_{A_2})$.



By the $A_2$ conjecture one understands the strengthening of this claim in which the logarithmic term is deleted; in other words, a linear (in the weight's norm) estimate for an arbitrary weighted Calderon-Zygmund operator. In Q the $A_2$ conjecture was proved for Calderon-Zygmund operators having more than $2d$ smoothness in $\mathbb{R}^d$. The $A_2$ conjecture was fully proved in a preprint of T. Hytonen, see [3]. The proof is based on the main theorem in the paper [17] of Perez-Treil-Volberg. Both [17] and [3] are neither short nor easy.

A direct proof of the $A_2$ conjecture (without going through [17]) was given in [8]. It was based on two ingredients: 1) a formula for decomposing an arbitrary Calderon-Zygmund operator into (generalized) dyadic shifts by the averaging trick; 2) an estimate of the norm of a dyadic shift that is polynomial in the complexity and linear in the weight.

The latter was quite complicated and was based on a modification of the argument in Lacey-Petermichl-Reguera EJ. The former was rooted in the works on non-homogeneous Harmonic Analysis, e.g. [11]-[15], but with a new twist, which appeared first in Hytonen's [3] and was simplified in Hytonen-Perez-Treil-Volberg's [8].

The averaging trick was a development of the bootstrapping argument used by Nazarov-Treil-Volberg [11]-[15], where they exploited the fact that the bad part of a function can be made arbitrarily small. Using the original Nazarov-Treil-Volberg averaging trick would add an extra factor depending on $[w]_{A_2}$ to the estimate, so a new idea was necessary. The new observation in [3] was that as soon as the probability of a "bad" cube is less than 1, it is possible to completely ignore the bad cubes (at least in the situation where they cause trouble).

1.1. Structure of the paper. Here we give a proof of the $A_2$ conjecture in geometrically doubling metric spaces (GDMS), i.e. metric spaces where one can fit no more than a fixed number of disjoint balls of radius $r$ in a ball of radius $2r$.
The paper is organized as follows:

(i) A construction of a probability space of random "dyadic" lattices in a metric space is given in Section 2.

(ii) The averaging trick of Hytonen [3] (in a form that we believe is simpler) is given in Section 6.

(iii) A linear estimate of a weighted dyadic shift on a metric space from [16], which uses the Bellman function technique, is given in Sections 7 and 8. For another proof of the linear estimate for weighted dyadic shifts, which can be easily adjusted to the metric case, we refer to [19].

Our main result is the following. 

Theorem 1.3 ($A_2$ theorem for a geometrically doubling metric space). Let $X$ be a geometrically doubling metric space, $\mu$ and $T$ as above, $w \in A_{2,\mu}$. In addition we assume that $\mu$ is a doubling measure. Then
\[
\|T\|_{L^2(w\,d\mu)\to L^2(w\,d\mu)} \le C(T)\,[w]_{2,\mu}. \tag{1}
\]

We postpone the precise definitions to Section 6. The precise definition of a geometrically doubling metric space is given in the next section.

2. First step 

Consider a compact doubling metric space $X$ with metric $d$ and doubling constant $A$. Instead of $d(x,y)$ we write $|xy|$. Precisely, the definition is the following.

Definition 1. Suppose $(X, |\cdot|)$ is a metric space. We call it geometrically doubling with constant $A$ if for any $x \in X$ and $r > 0$ we can fit no more than $A$ disjoint balls of radius $r/2$ in the ball $B(x,r)$.

Like the authors of [6], we essentially use the idea of Michael Christ [2], but we randomize his construction in a different way. Therefore, we want to warn the reader that even though on the surface the proof below is very close to the proof from [6], our construction is essentially different, and so the proof of the assertion in our main lemma, which was not hard in [6], becomes much more subtle here.

The main difference between the construction in [6] and the one here is that the one here is of "bottom to top" type, meaning that the centers of "father cubes" are chosen randomly after the centers of "son cubes" are fixed. The construction in [6] goes "top to bottom", and it is not that clear to us why "father cubes" have enough independence from "son cubes" to ensure that, in the model where an elementary event is one dyadic lattice, the probability for a cube of a lattice to be "bad" (see the definition below) with respect to cubes of the same lattice is strictly less than one. However, we still feel that the construction of [6] can most probably be used for the purposes of our result as well; we just feel that it is a bit easier to check that everything falls into place with our construction below.

We now proceed to the construction. 

For a number $k > 0$ we say that a set $G$ is a $k$-grid if $G$ is a maximal (with respect to inclusion) set such that for any distinct $x, y \in G$ we have $d(x,y) \ge k$.

Let from now on $\operatorname{diam} X = 1$. Take a small positive number $\delta \ll 1$ depending on the doubling constant of $X$ and a large natural number $N$, and for every $M \ge N$ fix $G_M = \{z_M\}$, a certain $\delta^M$-grid of $X$. Now take $G_N$ and randomly choose a $\delta^{N-1}$-grid $G_{N-1}$ in $G_N$. Then take $G_{N-1}$ and randomly choose a $\delta^{N-2}$-grid $G_{N-2}$ in $G_{N-1}$. Do this $N$ times. Notice that $G_0$ consists of just one random point of $G_N$.

We explain what "randomly" means. Since $X$ is a compact metric space, all the $G_k$'s are finite. Therefore, there are finitely many $\delta^{N-1}$-grids in $G_N$. We choose one of them with probability
\[
\frac{1}{\text{number of } \delta^{N-1}\text{-grids in } G_N}.
\]
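The uniform choice of a grid described above can be illustrated on a toy example. The following is only a sketch under stated assumptions: the five-point space, the helper names `is_grid`, `all_grids` and `random_grid`, and the brute-force enumeration are ours and are not part of the construction in the paper; separation is taken as "distance at least $k$" and maximality as "every other point is at distance less than $k$ from a chosen one".

```python
import itertools
import random


def is_grid(subset, points, dist, k):
    """Check that `subset` is a k-grid of `points`.

    Separation: distinct chosen points are at distance >= k.
    Maximality: every other point is at distance < k from a chosen one.
    """
    chosen = set(subset)
    separated = all(dist(p, q) >= k for p, q in itertools.combinations(subset, 2))
    maximal = all(
        any(dist(p, q) < k for q in subset)
        for p in points
        if p not in chosen
    )
    return separated and maximal


def all_grids(points, dist, k):
    """Enumerate every k-grid of a finite set by brute force (exponential time)."""
    return [
        subset
        for size in range(1, len(points) + 1)
        for subset in itertools.combinations(points, size)
        if is_grid(subset, points, dist, k)
    ]


def random_grid(points, dist, k, rng):
    """Choose one k-grid with probability 1 / (number of k-grids)."""
    return rng.choice(all_grids(points, dist, k))


# Toy finite metric space (an assumption for illustration): five points on a line.
points = [0, 1, 2, 3, 4]
dist = lambda p, q: abs(p - q)
grids = all_grids(points, dist, 2)
sample = random_grid(points, dist, 2, random.Random(0))
```

In this toy space there are exactly four 2-grids, so each of them is selected with probability 1/4. The enumeration is exponential and is meant only to make the probability space concrete.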
Our first lemma is the following. 

Lemma 2.1. For $k = 0, \dots, N$,
\[
\bigcup_{y \in G_{N-k}} B(y, 3\delta^{N-k}) = X.
\]
Remark 1. For $N + k$, $k > 0$, instead of $N - k$ this is obvious.

Proof. Take $x \in X$. Then, since $G_N$ is maximal, there exists a point $y_0 \in G_N$ such that $|xy_0| \le \delta^N$. Since $G_{N-1}$ is maximal in $G_N$, there is a point $y_1 \in G_{N-1}$ such that $|y_0y_1| \le \delta^{N-1}$. Similarly we get $y_2, \dots, y_k$, and then
\[
|xy_k| \le |xy_0| + \dots + |y_{k-1}y_k| \le \delta^N + \dots + \delta^{N-k} = \delta^{N-k}(1 + \delta + \dots + \delta^k) \le \frac{\delta^{N-k}}{1-\delta} < 2\delta^{N-k}.
\]
□

Once we have all our sets $G_k$, we introduce a relationship $\prec$ between points. We follow [6] and [2].

Take a point $y_{k+1} \in G_{k+1}$. There exists at most one $y_k \in G_k$ such that $|y_{k+1}y_k| \le \delta^k/4$. This is true since if there are two such points $y_k$, $y_k'$, then
\[
|y_k y_k'| \le \frac{\delta^k}{2},
\]
which is a contradiction, since $G_k$ was a $\delta^k$-grid in $G_{k+1}$.

Also there exists at least one $z_k \in G_k$ such that $|y_{k+1}z_k| \le 3\delta^k$. This is true by the lemma.

Now, if there exists a $y_k$ as above, we set $y_{k+1} \prec y_k$. If not, then we pick one of the $z_k$ as above and set $y_{k+1} \prec z_k$. For all other $x \in G_k$ we set $y_{k+1} \not\prec x$. Then we extend by transitivity.

We also assume that $y_k \prec y_k$; this covers the case when $y_k$ on the left already belongs to $G_{k+1}$.

We do this procedure randomly and independently, and treat the same families of $G_k$'s with different $\prec$-laws as different families.
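One level of the parent assignment just described can be sketched as follows. This is an illustration only: the function name `assign_parents`, the integer toy space and the concrete thresholds `scale / 4` and `3 * scale` (standing for $\delta^k/4$ and $3\delta^k$) are assumptions made for the sketch, and the random tie-breaking is just one way to "pick one of the $z_k$".

```python
import random


def assign_parents(children, parents, dist, scale, rng):
    """Return a dict child -> parent for one level of the relation.

    `scale` plays the role of delta**k. The thresholds scale/4 and 3*scale
    are assumptions of this sketch, read off from the surrounding text.
    """
    relation = {}
    for y in children:
        close = [p for p in parents if dist(y, p) <= scale / 4]
        if close:
            # For a scale-separated grid there is at most one such parent.
            relation[y] = close[0]
        else:
            # The covering lemma guarantees a candidate when `parents` is a grid
            # of `children`; for arbitrary input this list could be empty.
            candidates = [p for p in parents if dist(y, p) <= 3 * scale]
            relation[y] = rng.choice(candidates)
    return relation


# Toy data (hypothetical): ten points on a line, parents form a 3-grid in them.
children = list(range(10))
parents = [0, 4, 8]
dist = lambda p, q: abs(p - q)
relation = assign_parents(children, parents, dist, 3, random.Random(1))
```

Every point that is itself a parent is mapped to itself, matching the convention $y_k \prec y_k$; all other points receive a random parent among the admissible ones.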

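Take now a point $y_k \in G_k$ and define
\[
Q_{y_k} = \bigcup_{z \prec y_k,\ z \in G_\ell} B\Bigl(z, \frac{\delta^\ell}{100}\Bigr).
\]

Lemma 2.2. For every $k$ we have
\[
X = \bigcup_{y_k \in G_k} \operatorname{clos}(Q_{y_k}).
\]

Remark 2. There is only one point in $G_0$, and $\operatorname{clos}(Q_y)$, $y \in G_0$, is just $X$. But for small $\delta$,
\[
X = \bigcup_{y_1 \in G_1} \operatorname{clos}(Q_{y_1})
\]
is a genuine (and random) splitting of $X$.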

Proof. Take any $x \in X$. By the previous lemma, for every $m > k$ there exists a point $x_m \in G_m$ such that $|xx_m| \le 3\delta^m$. In particular, $x_m \to x$. Fix for a moment $x_m$. Then there are points $y_{m-1} \in G_{m-1}, \dots, y_k \in G_k$ such that $x_m \prec y_{m-1} \prec \dots \prec y_k$. In particular, $x_m \in Q_{y_k}$, where $y_k$ depends on $x_m$. Then
\[
|y_kx| \le |y_kx_m| + |x_mx| \le |y_kx_m| + 3\delta^m \le |y_kx_m| + 3\delta^k.
\]
Moreover, by the chain of $\prec$'s, we know that $|y_kx_m| \le 10\delta^k$. Therefore,
\[
|y_kx| \le 15\delta^k.
\]

We claim that the cardinality of the set $\{y_k\} = \{y_k(x_m)\}_{m \ge k}$ is bounded independently of $m$. This is true since all the $y_k$'s are separated from each other and by the doubling of our space (we are "stuffing" the ball $B(x, 15\delta^k)$ with balls $B(y_k, \delta^k)$).

So, take an infinite subsequence $x_m$ that corresponds to one point $y_k \in G_k$. Then we get $x_m \in Q_{y_k}$, $x_m \to x$, so $x \in \operatorname{clos} Q_{y_k}$, and we are done. □

Remark 3. Since the space $X$ is compact, our random procedure consists of finitely many steps. Therefore, our probability space is discrete. We suggest thinking about all probabilities just as the number of good events divided by the number of all events.

However, none of our estimates will depend on the number of steps (and, therefore, on the diameter of $X$), which is essential.

Remark 4. We notice that in the Euclidean space, say, $\mathbb{R}$, this procedure does not give a standard dyadic lattice.

3. Second step: technical lemmata 

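Define
\[
\tilde Q_{y_k} = X \setminus \bigcup_{z_k \ne y_k,\ z_k \in G_k} \operatorname{clos} Q_{z_k}.
\]

In particular,
\[
Q_{y_k} \subset \tilde Q_{y_k} \subset \operatorname{clos}(Q_{y_k}).
\]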

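Lemma 3.1 (Lemma 4.5 in [6]). Let $m$ be a natural number, $\varepsilon > 0$, and $\delta^m \ge 100\varepsilon$. Suppose $x \in \operatorname{clos} Q_{y_k}$ and $\operatorname{dist}(x, X \setminus \tilde Q_{y_k}) < \varepsilon\delta^k$. Then for any chain
\[
z_{k+m} \prec z_{k+m-1} \prec \dots \prec z_{k+1} \prec z_k
\]
such that $x \in \operatorname{clos} Q_{z_{k+m}}$, the following relationships hold:
\[
|z_iz_j| \ge \frac{\delta^j}{1000}, \qquad k \le j < i \le k + m.
\]

Proof. Suppose $|z_iz_j| < \frac{\delta^j}{1000}$. We first consider the case $z_k = y_k$. Since $z_j \prec z_k = y_k$, we have $B(z_j, \frac{\delta^j}{100}) \subset Q_{y_k} \subset \tilde Q_{y_k}$. Therefore,
\[
\frac{\delta^j}{100} \le \operatorname{dist}(z_j, X \setminus \tilde Q_{y_k}) \le \operatorname{dist}(x, X \setminus \tilde Q_{y_k}) + \operatorname{dist}(x, z_i) + \operatorname{dist}(z_i, z_j) \le \varepsilon\delta^k + 5\delta^i + \frac{\delta^j}{1000}.
\]
If $\delta$ is less than, say, $\frac{1}{1000}$, then we get a contradiction.

The only non-obvious estimate is that $\operatorname{dist}(x, z_i) \le 5\delta^i$. It is true since $x \in \operatorname{clos} Q_{z_{k+m}}$.

We have proved the lemma under the assumption that $z_k = y_k$. Let us get rid of this assumption. We know that
\[
x \in \operatorname{clos} Q_{z_{k+m}} \subset \operatorname{clos} Q_{z_k}.
\]
Also we have $x \in \operatorname{clos} Q_{y_k}$, so, since
\[
\tilde Q_{z_k} = X \setminus \bigcup_{u_k \ne z_k} \operatorname{clos} Q_{u_k} \subset X \setminus \operatorname{clos} Q_{y_k},
\]
we get $x \in X \setminus \tilde Q_{z_k}$. In particular, $\operatorname{dist}(x, X \setminus \tilde Q_{z_k}) = 0 < \varepsilon\delta^k$, and we are in the situation of the first part. This finishes our proof. □

Lemma 3.2. Fix $x_k \in G_k$. Then
\[
\mathbb{P}\Bigl(\exists x_{k-1} \in G_{k-1} : |x_kx_{k-1}| < \frac{\delta^{k-1}}{1000}\Bigr) \ge a \tag{2}
\]
for some $a \in (0,1)$.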
Proof. We recall that we are in a compact metric situation. By rescaling we can think that we work with $G_1$ and choose $G_0$. We can even think that the metric space consists of finitely many points, namely $X := G_2$. The finite set $G_1 \subset X$ consists of points having the following properties:

1. $\forall x, y \in G_1$, $x \ne y$, we have $|xy| \ge \delta$;

2. if $z \in X \setminus G_1$ then $\exists x \in G_1$ such that $|zx| < \delta$.

These two properties are equivalent to saying that the subset $G_1$ of $X$ consists of points such that $\forall x, y \in G_1$, $x \ne y$, we have $|xy| \ge \delta$, and we cannot add any point from $X$ to $G_1$ without violating that property. In other words, $G_1$ is a maximal set with property 1.

Recall that here the word "maximal" means maximal with respect to inclusion, not maximal in the sense of the number of elements.

Now we consider the new metric space $Y = G_1$, and $G_0$ is any maximal subset such that
\[
\forall x, y \in G_0,\ x \ne y: \quad |xy| \ge 1. \tag{3}
\]

In other words, we have

1. $\forall x, y \in G_0$, $x \ne y$, we have $|xy| \ge 1$;

2. if $z \in Y \setminus G_0$ then $\exists x \in G_0$ such that $|zx| < 1$.

There are finitely many such maximal subsets $G_0$ of $Y$. We prescribe the same probability to each choice. Now we want to prove a claim that is even stronger than (2). Namely, we are going to prove that given $y \in Y$,
\[
\mathbb{P}(\exists x \in G_0 : x = y) \ge a, \tag{4}
\]
where $a$ depends only on $\delta$ and the constants of geometric doubling of our compact metric space.

Let $Y$ be any metric space with finitely many elements. We will color the points of $Y$ red and green. The coloring is called proper if

1. every red point does not have any other red point at distance $< 1$;

2. every green point has at least one red point at distance $< 1$.

Given a proper coloring of $Y$, the collection of red points is called a 1-lattice. It is a maximal (by inclusion) collection of points at distance $\ge 1$ from each other.
What we need to finish the proof is

Lemma 3.3. Let $Y$ be a finite metric space as above. Assume $Y$ has the following property:
\[
\text{In every ball of radius less than 1 there are at most } d \text{ elements.} \tag{5}
\]
Let $\mathcal{L}$ be the collection of all 1-lattices in $Y$. Elements of $\mathcal{L}$ are called $L$. Let $v \in Y$. Then
\[
\frac{\text{the number of 1-lattices } L \text{ such that } v \text{ belongs to } L}{\text{the total number of 1-lattices } L} \ge a > 0,
\]
where $a$ depends only on $d$.

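Proof. Given $v \in Y$, consider all subsets of $B(v,1) \setminus \{v\}$; this collection is called $\mathcal{S}$. Let $S \in \mathcal{S}$. We call $W_S$ the collection of all proper colorings such that $v$ is green, all elements of $S$ are red, and all elements of $B(v,1) \setminus S$ are green. We call $\tilde S$ the set of all points in $Y$ which are not in $B(v,1)$, but are at distance $< 1$ from some point in $S$.

All proper colorings of $Y$ such that $v$ is red are called $B$. Let us show that
\[
\operatorname{card} W_S \le \operatorname{card} B. \tag{6}
\]

Notice that if (6) were proved, we would be done with Lemma 3.3, with $a \ge 2^{-d+1}$: the collection $W_\emptyset$ is empty (a green $v$ needs a red point in $B(v,1)$), so there are at most $2^{d-1} - 1$ non-empty classes $W_S$ besides $B$. Consequently, the proof of the main lemma would be finished, with $a$ depending only on $\delta$ and $D$, where $D$ is a geometric doubling constant.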

To prove (6), let us show that we can recolor any proper coloring from $W_S$ into one from $B$, and that this map is injective. Let $L \in W_S$. We proceed as follows.

1. Color $v$ red;

2. Color $S$ green;

3. Elements of $\tilde S$ were all green before. We leave them green, but we find among them all those $y$ such that now all elements of the open ball $B(y,1)$ in $Y$ are green. We call them yellow (temporarily) and denote this set by $Z$;

4. We enumerate $Z$ in any way (non-uniqueness appears here, but we do not care);

5. In the order of enumeration we color yellow points red, skipping a point of $Z$ if it is at distance $< 1$ from any point of $Z$ previously recolored from yellow to red. After several steps all green and yellow elements of $\tilde S$ will have a red point at distance $< 1$;

6. Color the remaining yellow points (if any) green and stop.

We arrive at a proper coloring (this is easy to check), which obviously lies in $B$. Suppose $L_1, L_2$ are two different proper colorings in $W_S$. Notice that the colors of $v$, $S$, $B(v,1) \setminus S$, $\tilde S$ are the same for them. So they differ somewhere else. But our procedure does not touch "somewhere else". So the modified colorings $L_1', L_2'$ that we obtain after the algorithm 1-6 will differ as well (maybe even more). So our map $W_S \to B$ (although not uniquely defined) is injective. We proved (6).

□

Thus, the proof of Lemma 3.2 is finished.

□
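The counting statement of Lemma 3.3 and the inequality (6) can be checked by brute force on a small example. The sketch below is only a sanity check under stated assumptions: the six-point space with spacing 0.6 and the helper names `lattices`, `lattice_fraction` and `recoloring_bound_holds` are ours, and the code compares cardinalities directly rather than implementing the recoloring map.

```python
import itertools
from fractions import Fraction


def lattices(points, dist):
    """All 1-lattices: red sets of proper colorings, found by brute force."""
    found = []
    for size in range(1, len(points) + 1):
        for red in itertools.combinations(points, size):
            red_set = set(red)
            # No two red points at distance < 1.
            separated = all(dist(p, q) >= 1 for p, q in itertools.combinations(red, 2))
            # Every green point has a red point at distance < 1.
            covered = all(
                any(dist(g, r) < 1 for r in red)
                for g in points
                if g not in red_set
            )
            if separated and covered:
                found.append(frozenset(red))
    return found


def lattice_fraction(points, dist, v):
    """Fraction of 1-lattices that contain the point v."""
    every = lattices(points, dist)
    return Fraction(sum(1 for L in every if v in L), len(every))


def recoloring_bound_holds(points, dist, v):
    """Check card W_S <= card B for every S, by counting (not by recoloring)."""
    every = lattices(points, dist)
    card_b = sum(1 for L in every if v in L)
    ball = {p for p in points if p != v and dist(p, v) < 1}
    classes = {}
    for L in every:
        if v not in L:
            key = frozenset(L & ball)  # the set S of red points near v
            classes[key] = classes.get(key, 0) + 1
    return all(count <= card_b for count in classes.values())


# Toy space (an assumption): six points on a line with spacing 0.6, so d = 3.
points = list(range(6))
dist = lambda p, q: 0.6 * abs(p - q)
fractions_by_point = {v: lattice_fraction(points, dist, v) for v in points}
```

In this example every open ball of radius 1 contains at most $d = 3$ points, and every value in `fractions_by_point` is at least $2^{-d+1} = 1/4$, in agreement with the bound discussed in the proof.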

Remark. We are grateful to Michael Shapiro and Dapeng Zhan who helped us to prove Lemma 

4. Main definition and theorem 

Fix a number $\gamma$, $0 < \gamma < 1$. Later the choice of $\gamma$ will be dictated by the Calderon-Zygmund properties of the operator $T$. Also fix a sufficiently big $r$. The choice of $r$ will be made in this section.

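Definition 2 (Bad cubes). Take a "cube" $Q = Q_{x_k}$. We say that $Q$ is good if for every cube $Q_1 = Q_{x_n}$ with
\[
\delta^k \le \delta^r\delta^n \qquad (k \ge n + r)
\]
we have either
\[
\operatorname{dist}(Q, Q_1) \ge \delta^{k\gamma}\delta^{n(1-\gamma)}
\]
or
\[
\operatorname{dist}(Q, X \setminus Q_1) \ge \delta^{k\gamma}\delta^{n(1-\gamma)}.
\]
Remark 5. Notice that $\delta^k = \ell(Q)$ just by definition.

If $Q$ is not good we call it bad.
Theorem 4.1. Fix a cube $Q_{x_k}$. Then
\[
\mathbb{P}(Q_{x_k} \text{ is bad}) \le \frac12.
\]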

Remark 6 (Discussion). This theorem makes sense because when we fix a cube $Q_{x_k}$, say, with $k \ge N$, so that the grid $G_k$ is not even random, we can make the big cubes random. And we claim that for a large proportion of choices, our big cubes will have $Q_{x_k}$ either "in the middle" or far away, but not close to the boundary.

Definition 3. For $Q = Q_{x_k}$ define
\[
\delta_Q(\varepsilon) = \delta_Q = \{x : \operatorname{dist}(x, Q) \le \varepsilon\delta^k \text{ and } \operatorname{dist}(x, X \setminus Q) \le \varepsilon\delta^k\}.
\]

Lemma 4.2. Let us start with level $N$ by fixing a (non-random) $\delta^N$-grid, and let $k < N$, with $x_k$ denoting the points of the (random) grid $G_k$. Fix a point $x \in X$. Then
\[
\mathbb{P}(\exists x_k \in G_k : x \in \delta_{Q_{x_k}}) \le C\varepsilon^\eta
\]
for some $\eta > 0$.
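Proof of the theorem. Take the cube $Q_{x_k}$. There is a unique (random!) point $x_{k-s}$ such that $x_k \in Q_{x_{k-s}}$. Then
\[
\operatorname{dist}(Q_{x_k}, X \setminus Q_{x_{k-s}}) \ge \operatorname{dist}(x_k, X \setminus Q_{x_{k-s}}) - \operatorname{diam}(Q_{x_k}) \ge \operatorname{dist}(x_k, X \setminus Q_{x_{k-s}}) - C\delta^k.
\]

Assume that $\operatorname{dist}(x_k, X \setminus Q_{x_{k-s}}) \ge 2\delta^{k\gamma}\delta^{(k-s)(1-\gamma)}$ and that $s \ge r$ (the latter assumption is natural, since otherwise $Q_{x_{k-s}}$ does not affect the goodness of $Q_{x_k}$).
Then, if $r$ is big enough ($\delta^{r(1-\gamma)} < \frac1C$), we get
\[
\operatorname{dist}(Q_{x_k}, X \setminus Q_{x_{k-s}}) \ge \delta^{k\gamma}\delta^{(k-s)(1-\gamma)},
\]
and so $Q_{x_k}$ is good. Therefore,
\[
\mathbb{P}(Q_{x_k} \text{ is bad}) \le C\sum_{s \ge r}\mathbb{P}\bigl(x_k \in \delta_{Q_{x_{k-s}}}(\varepsilon = 2\delta^{s\gamma})\bigr) \le C\sum_{s \ge r}\delta^{s\gamma\eta} \le 100C\,\delta^{r\gamma\eta}.
\]

By the choice of $\eta$, for sufficiently large $r$ this is less than $\frac12$. □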

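Proof of the lemma. Let $x_k$ be such that $x \in \operatorname{clos} Q_{x_k}$ (see Lemma 2.2). We will estimate $\mathbb{P}(\operatorname{dist}(x, X \setminus \tilde Q_{x_k}) < \varepsilon\delta^k \mid x \in \operatorname{clos} Q_{x_k})$. Fix the largest $m$ such that $500\varepsilon \le \delta^m$. Choose a point $x_{k+m}$ such that $x \in \operatorname{clos} Q_{x_{k+m}}$. Then by the main lemma
\[
\mathbb{P}\Bigl(\exists x_{k+m-1} \in G_{k+m-1} : |x_{k+m}x_{k+m-1}| < \frac{\delta^{k+m-1}}{1000}\Bigr) \ge a.
\]
Therefore,
\[
\mathbb{P}\Bigl(\forall x_{k+m-1} \in G_{k+m-1} : |x_{k+m}x_{k+m-1}| \ge \frac{\delta^{k+m-1}}{1000}\Bigr) \le 1 - a.
\]
Let now
\[
x_{k+m} \prec x_{k+m-1}.
\]
Then
\[
\mathbb{P}\Bigl(\forall x_{k+m-2} \in G_{k+m-2} : |x_{k+m-1}x_{k+m-2}| \ge \frac{\delta^{k+m-2}}{1000}\Bigr) \le 1 - a.
\]

So by Lemma 3.1
\[
\mathbb{P}(\operatorname{dist}(x, X \setminus \tilde Q_{x_k}) < \varepsilon\delta^k) \le \mathbb{P}\Bigl(|x_{k+j}x_{k+j-1}| \ge \frac{\delta^{k+j-1}}{1000}\ \ \forall j = 1, \dots, m\Bigr) \le (1-a)^m \le C\varepsilon^\eta
\]
for
\[
\eta = \frac{\log(1-a)}{\log(\delta)}.
\]

□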

5. Probability to be "good" is the same for every cube 

We make the last step to make the probability of being "good" not just bounded away from zero, but the same for all cubes. We use the idea from [9].

Take a cube $Q(\omega)$. Take a random variable $\xi_Q(\omega')$, which is uniformly distributed on $[0,1]$. We know that
\[
\mathbb{P}(Q \text{ is good}) = p_Q \ge a > 0.
\]

We call $Q$ "really good" if $Q$ is good and
\[
\xi_Q \in \Bigl[0, \frac{a}{p_Q}\Bigr].
\]

Otherwise $Q$ joins the bad cubes. Then
\[
\mathbb{P}(Q \text{ is really good}) = a,
\]
and we are done.
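The equalization step above is a one-line computation, which the following sketch spells out with exact rational arithmetic. It assumes (this is our reading, not a statement from the text) that the auxiliary variable is independent of the goodness of the cube and that a good cube is kept when the variable does not exceed $a / p_Q$; the function name `really_good_probability` is hypothetical.

```python
from fractions import Fraction


def really_good_probability(p_good, a):
    """Probability that a cube is 'really good' after the extra thinning.

    Assumes an independent uniform variable on [0, 1]; a good cube is kept
    when that variable is at most a / p_good.
    """
    if not (0 < a <= p_good <= 1):
        raise ValueError("need 0 < a <= p_good <= 1")
    keep = a / p_good          # chance that the uniform variable is <= a / p_good
    return p_good * keep       # independence of the two events is assumed


a = Fraction(1, 2)
results = [really_good_probability(Fraction(n, 10), a) for n in range(5, 11)]
```

Whatever the original probability $p_Q \ge a$ is, the thinned probability equals $a$, which is the point of the construction.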
6. Application

As a main application of our construction, we state the following theorem. 

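Definition 4. Let $X$ be a geometrically doubling metric space.

Let $\lambda(x,r)$ be a positive function, increasing and doubling in $r$, i.e. $\lambda(x, 2r) \le C\lambda(x,r)$, where $C$ does not depend on $x$ and $r$.

Suppose $K(x,y) : X \times X \to \mathbb{R}$ is a Calderon-Zygmund kernel associated to the function $\lambda$, i.e.
\[
|K(x,y)| \le C\min\Bigl(\frac{1}{\lambda(x, |xy|)}, \frac{1}{\lambda(y, |xy|)}\Bigr), \tag{7}
\]
\[
|K(x,y) - K(x',y)| \le C\,\frac{|xx'|^\varepsilon}{|xy|^\varepsilon\,\lambda(x, |xy|)}, \qquad |xy| \ge C|xx'|, \tag{8}
\]
\[
|K(x,y) - K(x,y')| \le C\,\frac{|yy'|^\varepsilon}{|xy|^\varepsilon\,\lambda(y, |xy|)}, \qquad |xy| \ge C|yy'|. \tag{9}
\]

By $B(x,r)$ we denote the ball in the $|\cdot|$ metric, i.e., $B(x,r) = \{y \in X : |yx| < r\}$.

Let $\mu$ be a measure on $X$ such that $\mu(B(x,r)) \le C\lambda(x,r)$, where $C$ does not depend on $x$ and $r$.
We say that $T$ is a Calderon-Zygmund operator with kernel $K$ if
\[
T \text{ is bounded } L^2(\mu) \to L^2(\mu), \tag{10}
\]
\[
Tf(x) = \int K(x,y)f(y)\,d\mu(y), \qquad \forall x \notin \operatorname{supp} f\mu,\ \forall f \in C_0. \tag{11}
\]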

Definition 5. Let $w > 0$ $\mu$-a.e. Define
\[
w \in A_{2,\mu} \iff [w]_{2,\mu} = \sup_{x,r}\ \frac{1}{\mu(B(x,r))}\int_{B(x,r)} w\,d\mu \cdot \frac{1}{\mu(B(x,r))}\int_{B(x,r)} w^{-1}\,d\mu < \infty.
\]
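For intuition, the characteristic $[w]_{2,\mu}$ can be computed exactly on a finite metric measure space, where only finitely many distinct balls exist. The sketch below is an illustration under our own assumptions: the function name `a2_characteristic`, the dictionary-based representation of $w$ and $\mu$, and the two-point example are hypothetical and not taken from the paper.

```python
from fractions import Fraction


def a2_characteristic(points, weight, measure, dist):
    """Supremum over open balls B(x, r) of <w>_B * <1/w>_B on a finite space."""
    distances = sorted({dist(p, q) for p in points for q in points if p != q})
    # Each radius below realizes one distinct open ball around a given center;
    # the last radius is large enough to capture the whole space.
    radii = distances + [(distances[-1] if distances else 0) + 1]
    best = Fraction(0)
    for x in points:
        for r in radii:
            ball = [y for y in points if dist(y, x) < r]
            mass = sum(Fraction(measure[y]) for y in ball)
            mean_w = sum(weight[y] * Fraction(measure[y]) for y in ball) / mass
            mean_inv = sum(Fraction(measure[y]) / weight[y] for y in ball) / mass
            best = max(best, mean_w * mean_inv)
    return best


dist = lambda p, q: abs(p - q)
flat = a2_characteristic([0, 1, 2], {0: 1, 1: 1, 2: 1}, {0: 1, 1: 1, 2: 1}, dist)
steep = a2_characteristic([0, 1], {0: 1, 1: 4}, {0: 1, 1: 1}, dist)
```

A constant weight gives the value 1, and by the Cauchy-Schwarz inequality the value is never smaller than 1; for the two-point weight $(1, 4)$ with counting measure the whole space gives $\frac52 \cdot \frac58 = \frac{25}{16}$.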

Theorem 6.1 ($A_2$ theorem for a geometrically doubling metric space). Let $X$ be a geometrically doubling metric space, $\mu$ and $T$ as above, $w \in A_{2,\mu}$. In addition we assume that $\mu$ is a doubling measure. Then
\[
\|T\|_{L^2(w\,d\mu)\to L^2(w\,d\mu)} \le C(T,X)\,[w]_{2,\mu}. \tag{12}
\]

Remark 7. We note that the existence of such a $\mu$ on any GDMS was proved in [10].

6.1. Proof of the theorem. Take two step functions, $f$ and $g$. We first fix a $\delta^N$-grid $G_N$ in $X$, and "cubes" on level $N$, such that $f$ and $g$ are constant on every such cube. Then we start our randomization process.

As we mentioned, this process consists of finitely many steps, so all probabilistic terminology becomes trivial: we have a finite probability space.

Starting from $G_N$, we go "up" and on each level get dyadic cubes (random Christ's cubes). They have the usual structure of being either disjoint or one containing the other. For each dyadic cube $Q$ we have several dyadic sons; they are denoted by $s_i(Q)$, $i = 1, \dots, M(Q) \le M$. The number $M$ here is universal and depends only on the geometric doubling constants of the space $X$.

Definition 6. By $\mathcal{D}_k$ we denote the set of all dyadic "cubes" of generation $k$. We call $Q_k^i \subset Q_{k-1}$, $Q_k^i \in \mathcal{D}_k$, sons of $Q_{k-1}$.

With every cube $Q = Q_{x_k}$ we associate Haar functions $h_Q^j$, $j = 1, \dots, M-1$, with the following properties:

(i) $h_Q^j$ is supported on $Q$;

(ii) $h_Q^j$ takes constant values on each "son" of $Q$;

(iii) For any two cubes $Q$ and $R$, we have $(h_Q^j, h_R^i) = 0$, and $(h_Q^j, 1) = 0$;
We notice that the last property implies that $\|h_Q^j\|_2 \le C$.
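One concrete way to produce functions with properties (i)-(iii) on a single cube is Gram-Schmidt orthogonalization in $L^2(\mu)$ among functions that are constant on the sons. The sketch below is only an illustration: the paper does not specify this construction, and the names `inner` and `haar_system`, as well as the representation of a function by its list of values on the sons, are our assumptions.

```python
import math


def inner(u, v, masses):
    """L^2(mu) inner product of two functions constant on the sons of a cube."""
    return sum(m * a * b for m, a, b in zip(masses, u, v))


def haar_system(masses):
    """Return M - 1 orthonormal, mean-zero functions on a cube with M sons.

    `masses[i]` is the measure of the i-th son; a function is stored as the
    list of its values on the sons.
    """
    count = len(masses)
    basis = [[1.0] * count]  # the constant function; it is dropped at the end
    for i in range(count - 1):
        candidate = [1.0 if j == i else 0.0 for j in range(count)]
        for b in basis:
            coeff = inner(candidate, b, masses) / inner(b, b, masses)
            candidate = [c - coeff * x for c, x in zip(candidate, b)]
        norm = math.sqrt(inner(candidate, candidate, masses))
        basis.append([c / norm for c in candidate])
    return basis[1:]


masses = [1.0, 2.0, 3.0]  # hypothetical measures of three sons
haar = haar_system(masses)
```

For a cube with three sons this yields two functions that are orthogonal to constants and to each other, and have unit norm in $L^2(\mu)$.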

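We use angular brackets to denote the average: $\langle f\rangle_{Q,\mu} := \frac{1}{\mu(Q)}\int_Q f\,d\mu$. When we average over the whole space $X$, we drop the index and write $\langle f\rangle = \frac{1}{\mu(X)}\int_X f\,d\mu$.

Our main "tool" is going to be the famous "dyadic shifts". Precisely, we denote by $\mathbb{S}_{m,n}$ the operator given by the kernel
\[
f \mapsto \sum_{L \in \mathcal{D}}\int_L a_L(x,y)f(y)\,d\mu(y),
\]
where
\[
a_L(x,y) = \sum_{\substack{I \subset L,\ J \subset L\\ g(I) = g(L) + m,\ g(J) = g(L) + n}} c_{L,I,J}\,h_J^j(x)\,h_I^i(y),
\]
where $h_I^i$, $h_J^j$ are Haar functions normalized in $L^2(d\mu)$ and satisfying (iv), and $|c_{L,I,J}| \le \frac{\sqrt{\mu(I)}\sqrt{\mu(J)}}{\mu(L)}$.

Often we will skip the superscripts $i$, $j$.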

Our next aim is to decompose the bilinear form of the operator $T$ into bilinear forms of dyadic shifts, which are estimated in Section 8. The rest will be the so-called "paraproducts", estimated in Section 7.

The functions $\{\chi_X\} \cup \{h_Q^j\}$ form an orthogonal basis in the space $L^2(X,\mu)$. Therefore, we can write
\[
f = \langle f\rangle\chi_X + \sum_Q\sum_j (f, h_Q^j)h_Q^j, \qquad g = \langle g\rangle\chi_X + \sum_R\sum_i (g, h_R^i)h_R^i.
\]

First, we state and prove the theorem that says that the essential part of the bilinear form of $T$ can be expressed in terms of pairs of cubes where the smaller one is good. We follow the idea of Hytonen [5]. In fact, the work [3] improved on the "good-bad" decomposition of [11], [12], [15] by replacing inequalities by an equality.

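Theorem 6.2. Let $T$ be any linear operator. Then the following equality holds:
\[
\pi_{good}\,\mathbb{E}\sum_{\substack{Q,R,i,j\\ \ell(Q) \ge \ell(R)}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i) = \mathbb{E}\sum_{\substack{Q,R,i,j\\ \ell(Q) \ge \ell(R),\ R \text{ is good}}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i).
\]
The same is true if we replace $\ge$ by $>$. Here $\pi_{good}$ denotes the probability that a cube is good, which by Section 5 is the same for every cube.
Proof. We denote
\[
\sigma_1(T) = \sum_{\ell(Q) \ge \ell(R)} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i), \qquad \sigma_1'(T) = \sum_{\substack{\ell(Q) \ge \ell(R)\\ R \text{ is good}}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i).
\]

We would like to get a relationship between $\mathbb{E}\sigma_1(T)$ and $\mathbb{E}\sigma_1'(T)$.
We fix $R$ and write (using $g_{good} := \sum_{R \text{ is good}} (g, h_R^i)h_R^i$)
\[
\sum_Q\sum_{R \text{ is good}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i) = \Bigl(T(f - \langle f\rangle\chi_X), \sum_{R \text{ is good}} (g, h_R^i)h_R^i\Bigr) = \bigl(T(f - \langle f\rangle\chi_X), g_{good}\bigr).
\]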
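Taking expectations, we obtain
\[
\begin{aligned}
\mathbb{E}\sum_{Q,R} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)\mathbf{1}_{R \text{ is good}}
&= \mathbb{E}\bigl(T(f - \langle f\rangle\chi_X), g_{good}\bigr) = \bigl(T(f - \langle f\rangle\chi_X), \mathbb{E}g_{good}\bigr)\\
&= \pi_{good}\bigl(T(f - \langle f\rangle\chi_X), g\bigr) = \pi_{good}\,\mathbb{E}\sum_{Q,R} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i).
\end{aligned} \tag{13}
\]

Next, suppose $\ell(Q) < \ell(R)$. Then the goodness of $R$ does not depend on $Q$, and so
\[
\pi_{good}\,(Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i) = \mathbb{E}\bigl((Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)\mathbf{1}_{R \text{ is good}} \mid Q, R\bigr).
\]

Let us explain this equality. The right-hand side is conditioned, meaning that it involves the ratio of the number of all lattices containing $Q$, $R$ and such that $R$ (the larger one) is good to the number of lattices containing $Q$, $R$. This ratio is exactly $\pi_{good}$.
Now we fix a pair $Q$, $R$ with $\ell(Q) < \ell(R)$ and multiply both sides by the probability that this pair is in the same dyadic lattice from our family. This probability is just the ratio of the number of dyadic lattices in our family containing the elements $Q$ and $R$ to the number of all dyadic lattices in our family. After multiplication by this ratio and summation of all terms with $\ell(Q) < \ell(R)$ we finally get
\[
\pi_{good}\,\mathbb{E}\sum_{\ell(Q) < \ell(R)} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i) = \mathbb{E}\sum_{\ell(Q) < \ell(R)} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)\mathbf{1}_{R \text{ is good}}. \tag{14}
\]

Now we use first (13) and then (14):
\[
\begin{aligned}
\pi_{good}\,\mathbb{E}\sum_{Q,R} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)
&= \mathbb{E}\sum_{Q,R} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)\mathbf{1}_{R \text{ is good}}\\
&= \mathbb{E}\sum_{\ell(Q) < \ell(R)} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)\mathbf{1}_{R \text{ is good}} + \mathbb{E}\sum_{\ell(Q) \ge \ell(R)} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)\mathbf{1}_{R \text{ is good}}\\
&= \pi_{good}\,\mathbb{E}\sum_{\ell(Q) < \ell(R)} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i) + \mathbb{E}\sum_{\substack{\ell(Q) \ge \ell(R)\\ R \text{ is good}}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i),
\end{aligned} \tag{15}
\]
and therefore
\[
\mathbb{E}\sum_{\substack{\ell(Q) \ge \ell(R)\\ R \text{ is good}}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i) = \pi_{good}\,\mathbb{E}\sum_{\ell(Q) \ge \ell(R)} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i). \tag{16}
\]

□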

This is the main trick. To have the whole sum expressed as a multiple of the sum in which the smaller cube is good is very useful, as we will see. It gives extra decay of the matrix coefficients $(Th_Q^j, h_R^i)$ and allows us to represent our operator as a "convex combination of dyadic shifts".

So, we have obtained that
\[
\mathbb{E}\sigma_1(T) = \frac{1}{\pi_{good}}\,\mathbb{E}\sigma_1'(T).
\]

Thus, to estimate $\mathbb{E}\sigma_1(T)$ it is enough to estimate $\mathbb{E}\sigma_1'(T)$. Exactly the same holds symmetrically for $\sigma_2(T)$.

6.2. Paraproducts. In this subsection we take care of the terms $\langle f\rangle\chi_X$ and $\langle g\rangle\chi_X$. These terms will lead to so-called paraproducts. In fact, let us introduce three auxiliary operators:
\[
\pi(f) := \pi_{T\chi_X}(f) := \sum_{Q,j} \langle f\rangle_Q\,(T\chi_X, h_Q^j)\,h_Q^j; \tag{17}
\]
\[
\pi_*(f) := \sum_{Q,j} (f, h_Q^j)(T^*\chi_X, h_Q^j)\,\frac{\chi_Q}{\mu(Q)} = (\pi_{T^*\chi_X})^*(f); \tag{18}
\]
\[
\sigma(f) := \langle f\rangle\langle T\chi_X\rangle\chi_X. \tag{19}
\]

Recall that $\langle\varphi\rangle$ denotes $\int_X \varphi\,d\mu$. These operators depend on the dyadic grid we chose. We shall need the following technical lemma.

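Lemma 6.3.
\[
(\pi(f), g) = \langle f\rangle(T\chi_X, g - \langle g\rangle\chi_X) + \sum (\pi h_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i),
\]
\[
(\pi_*(f), g) = \langle g\rangle(T^*\chi_X, f - \langle f\rangle\chi_X) + \sum (\pi_* h_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i).
\]

Proof. The second equality follows from the first one and the definition of $\pi_*$. We prove the first equality. We will not write the superscripts $i$ and $j$ in Haar functions.
We write
\[
\pi(f) = \langle f\rangle\pi(\chi_X) + \sum (f, h_Q)\pi(h_Q).
\]
Notice that
\[
\pi(\chi_X) = \sum (T\chi_X, h_Q)h_Q = T\chi_X - \langle T\chi_X\rangle,
\]
and that $\pi(f)$ is orthogonal to $\chi_X$. Thus,
\[
\begin{aligned}
(\pi(f), g) = (\pi(f), g - \langle g\rangle\chi_X) &= \langle f\rangle\Bigl(\pi(\chi_X), \sum (g, h_R)h_R\Bigr) + \sum (\pi h_Q, h_R)(f, h_Q)(g, h_R)\\
&= \langle f\rangle(T\chi_X, g - \langle g\rangle\chi_X) + \sum (\pi h_Q, h_R)(f, h_Q)(g, h_R),
\end{aligned}
\]
as desired. The last equality is true because $\langle T\chi_X\rangle$ is orthogonal to $g - \langle g\rangle\chi_X$. □
Notice that $\pi$, $\pi_*$ depend on the random dyadic grid. We introduce a random operator
\[
\tilde T f = Tf - \pi(f) - \pi_*(f).
\]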
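Now we state the following very useful lemma.
Lemma 6.4.
\[
(Tf, g) = \pi_{good}^{-1}\,\mathbb{E}\sum_{\text{smaller is good}} (\tilde T h_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i) + \mathbb{E}(\pi(f), g) + \mathbb{E}(\pi_*(f), g) + \langle f\rangle\langle g\rangle(T\chi_X, \chi_X).
\]
Proof. First, we write
\[
(Tf, g) = \sum (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i) + \langle f\rangle(T\chi_X, g) + \langle g\rangle(T^*\chi_X, f - \langle f\rangle\chi_X).
\]

We take expectations now. Notice that only the first term on the right-hand side depends on the dyadic grid. Therefore,
\[
(Tf, g) = \mathbb{E}\sum (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i) + \langle f\rangle(T\chi_X, g) + \langle g\rangle(T^*\chi_X, f - \langle f\rangle\chi_X).
\]
We focus on the first term. By Theorem 6.2 we know that
\[
\begin{aligned}
\mathbb{E}\sum (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i) &= \pi_{good}^{-1}\,\mathbb{E}\sum_{\text{smaller is good}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)\\
&= \pi_{good}^{-1}\,\mathbb{E}\sum_{\text{smaller is good}} (\tilde T h_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)\\
&\quad + \pi_{good}^{-1}\,\mathbb{E}\sum_{\text{smaller is good}} (\pi h_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i) + \pi_{good}^{-1}\,\mathbb{E}\sum_{\text{smaller is good}} (\pi_* h_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i).
\end{aligned} \tag{20}
\]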
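The first term is one of those that we want to get on the right-hand side.
On the other hand, we want to get a result for paraproducts similar to Theorem 6.2. Indeed, it is clear that
\[
(\pi h_Q^i, h_R^j) = \langle h_Q^i\rangle_R\,(T\chi_X, h_R^j),
\]
which is non-zero only if $R \subset Q$ and $R \ne Q$. So,
\[
\begin{aligned}
\mathbb{E}\sum_{\text{smaller is good}} (\pi h_Q^i, h_R^j)(f, h_Q^i)(g, h_R^j) &= \mathbb{E}\sum_{R \subset Q} \langle h_Q^i\rangle_R\,(T\chi_X, h_R^j)(f, h_Q^i)(g, h_R^j)\mathbf{1}_{R \text{ is good}}\\
&= \mathbb{E}\sum_R (T\chi_X, h_R^j)(g, h_R^j)\mathbf{1}_{R \text{ is good}}\sum_{Q :\, R \subset Q} (f, h_Q^i)\langle h_Q^i\rangle_R.
\end{aligned} \tag{21}
\]

We now see that since $f = \langle f\rangle\chi_X + \sum_Q (f, h_Q^i)h_Q^i$, we have
\[
\langle f\rangle_R - \langle f\rangle = (f, \mu(R)^{-1}\chi_R) - \langle f\rangle = \sum_Q (f, h_Q^i)\langle h_Q^i\rangle_R = \sum_{Q :\, R \subset Q} (f, h_Q^i)\langle h_Q^i\rangle_R.
\]

Therefore,
\[
\mathbb{E}\sum_{\text{smaller is good}} (\pi h_Q^i, h_R^j)(f, h_Q^i)(g, h_R^j) = \mathbb{E}\sum_R (T\chi_X, h_R^j)(g, h_R^j)\mathbf{1}_{R \text{ is good}}\bigl(\langle f\rangle_R - \langle f\rangle\bigr). \tag{22}
\]

Now it is clear that we can take the expectation inside (we have no $Q$ anymore, which was preventing us from doing that), and so we get
\[
\mathbb{E}\sum_{\text{smaller is good}} (\pi h_Q^i, h_R^j)(f, h_Q^i)(g, h_R^j) = \pi_{good}\,\mathbb{E}\sum_R (T\chi_X, h_R^j)(g, h_R^j)\bigl(\langle f\rangle_R - \langle f\rangle\bigr).
\]

Making all the above steps backwards, we get
\[
\mathbb{E}\sum_{\text{smaller is good}} (\pi h_Q^i, h_R^j)(f, h_Q^i)(g, h_R^j) = \pi_{good}\,\mathbb{E}\sum (\pi h_Q^i, h_R^j)(f, h_Q^i)(g, h_R^j).
\]

Therefore,
\[
\begin{aligned}
&\pi_{good}^{-1}\,\mathbb{E}\sum_{\text{smaller is good}} (\pi h_Q^i, h_R^j)(f, h_Q^i)(g, h_R^j) + \pi_{good}^{-1}\,\mathbb{E}\sum_{\text{smaller is good}} (\pi_* h_Q^i, h_R^j)(f, h_Q^i)(g, h_R^j)\\
&\quad = \mathbb{E}\sum (\pi h_Q^i, h_R^j)(f, h_Q^i)(g, h_R^j) + \mathbb{E}\sum (\pi_* h_Q^i, h_R^j)(f, h_Q^i)(g, h_R^j)\\
&\quad = \mathbb{E}(\pi(f), g) + \mathbb{E}(\pi_*(f), g) - \mathbb{E}\bigl[\langle f\rangle(T\chi_X, g - \langle g\rangle\chi_X)\bigr] - \mathbb{E}\bigl[\langle g\rangle(T^*\chi_X, f - \langle f\rangle\chi_X)\bigr].
\end{aligned} \tag{23}
\]

We now use that the last two terms do not depend on the dyadic grid, and so we drop the expectations. Finally,
\[
\begin{aligned}
(Tf, g) &= \pi_{good}^{-1}\,\mathbb{E}\sum_{\text{smaller is good}} (\tilde T h_Q^i, h_R^j)(f, h_Q^i)(g, h_R^j) + \mathbb{E}(\pi(f), g) + \mathbb{E}(\pi_*(f), g)\\
&\quad - \langle f\rangle(T\chi_X, g - \langle g\rangle\chi_X) - \langle g\rangle(T^*\chi_X, f - \langle f\rangle\chi_X) + \langle f\rangle(T\chi_X, g) + \langle g\rangle(T^*\chi_X, f - \langle f\rangle\chi_X)\\
&= \pi_{good}^{-1}\,\mathbb{E}\sum_{\text{smaller is good}} (\tilde T h_Q^i, h_R^j)(f, h_Q^i)(g, h_R^j) + \mathbb{E}(\pi(f), g) + \mathbb{E}(\pi_*(f), g) + \langle f\rangle\langle g\rangle(T\chi_X, \chi_X).
\end{aligned} \tag{24}
\]

This is what we wanted to prove. □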

The following lemma, which will be proved later, takes care of paraproducts.
Lemma 6.5. The operators $\pi$, $\pi_*$ are bounded on $L^2(X, w\,d\mu)$, and
\[
\|\pi\|_{2,w} \le C \cdot [w]_{2,\mu}.
\]

The same is true for $\pi_*$.
We postpone the proof of this lemma. We also notice that the operator
\[
\sigma(f) = \langle f\rangle\langle T\chi_X\rangle\chi_X
\]
is clearly bounded with the desired constant. In fact, as $T$ is bounded in the unweighted $L^2$, we have
\[
\langle T\chi_X\rangle^2 \le \|T\|_{L^2}^2 =: C_0,
\]
\[
\|\sigma(f)\|_{2,w}^2 = \langle f\rangle^2\langle T\chi_X\rangle^2\,w(X) \le C_0\,\langle f^2w\rangle\langle w^{-1}\rangle\,w(X) \le C\,[w]_2\,\|f\|_{2,w}^2.
\]

We, therefore, should take care only of the first term, the one with $\tilde T$. We now erase the tilde and write $T$ instead of $\tilde T$. Even though $T$ is not a Calderon-Zygmund operator anymore, all further estimates are true for $\tilde T$ (i.e., for a CZO minus paraproducts), see, for example, [6] or [8].



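6.3. Estimates of $\sigma_1'$. Our next step is to decompose $\sigma_1'$ into random dyadic shifts. We write
\[
\begin{aligned}
\mathbb{E}\sigma_1'(T) = \mathbb{E}\sum_{\substack{\ell(Q) \ge \ell(R)\\ R \text{ is good}}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)
&= \mathbb{E}\sum_{\substack{\ell(Q) \ge \delta^{-r_0}\ell(R)\\ R \subset Q,\ R \text{ is good}}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)\\
&\quad + \mathbb{E}\sum_{\substack{\ell(R) \le \ell(Q) < \delta^{-r_0}\ell(R)\\ R \text{ is good}}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)\\
&\quad + \mathbb{E}\sum_{\substack{\ell(Q) \ge \delta^{-r_0}\ell(R)\\ R \cap Q = \emptyset,\ R \text{ is good}}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i).
\end{aligned} \tag{25}
\]

Essentially, we will prove that the norm of every expectation is bounded by
\[
C(T) \cdot \mathbb{E}\sum_n \delta^{\varepsilon(\gamma)n}\,\|\mathbb{S}_n\|.
\]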



First, we state our choice for $\gamma$, which we have seen in the definition of good cubes.
Definition 7. Put
\[
\gamma = \frac{\varepsilon}{2(\varepsilon + \log_2 C)},
\]
where $C$ is the doubling constant of the function $\lambda$.

Remark 8. We remark that this choice of $\gamma$ makes Lemmata 6.6 and 6.7 true.
The estimate of the second sum is easy. In fact,
\[
\Bigl|\mathbb{E}\sum_{\substack{\ell(R) \le \ell(Q) < \delta^{-r_0}\ell(R)\\ R \text{ is good}}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i)\Bigr| \le C r_0\,[w]_2\,\|f\|_w\|g\|_{w^{-1}}.
\]

This is bounded by at most $r_0$ expressions for shifts of bounded complexity, so just see [16]. For more details, see [8].
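We denote
\[
\Sigma_{in} = \mathbb{E}\sum_{\substack{\ell(Q) \ge \delta^{-r_0}\ell(R)\\ R \subset Q,\ R \text{ is good}}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i),
\qquad
\Sigma_{out} = \mathbb{E}\sum_{\substack{\ell(Q) \ge \delta^{-r_0}\ell(R)\\ R \cap Q = \emptyset,\ R \text{ is good}}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i).
\]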



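6.4. Estimate of $\Sigma_{in}$. We use the following lemma.

Lemma 6.6. Let $T$ be as before; suppose $\ell(Q) \ge \delta^{-r_0}\ell(R)$ and $R \subset Q$. Let $Q_1$ be the son of $Q$ that contains $R$. Then
\[
|(Th_Q^j, h_R^i)| \lesssim \Bigl(\frac{\ell(R)}{\ell(Q)}\Bigr)^{\varepsilon/2}\Bigl(\frac{\mu(R)}{\mu(Q_1)}\Bigr)^{1/2}.
\]

We notice that $\mu(Q_1) \asymp \mu(Q)$.
We write
\[
\Sigma_{in} = \mathbb{E}\sum_{n \ge r_0}\ \sum_{\substack{\ell(Q) = \delta^{-n}\ell(R)\\ R \text{ is good},\ R \subset Q}} (Th_Q^j, h_R^i)(f, h_Q^j)(g, h_R^i),
\]
\[
|\Sigma_{in}| \le \mathbb{E}\sum_{n \ge r_0}\ \sum_{\substack{\ell(Q) = \delta^{-n}\ell(R)\\ R \text{ is good},\ R \subset Q}} |(Th_Q^j, h_R^i)|\,|(f, h_Q^j)|\,|(g, h_R^i)| \le C\,\mathbb{E}\sum_{n \ge r_0}\delta^{\varepsilon n/2}\sum_{\substack{\ell(Q) = \delta^{-n}\ell(R)\\ R \text{ is good},\ R \subset Q}} \Bigl(\frac{\mu(R)}{\mu(Q)}\Bigr)^{1/2}|(f, h_Q^j)|\,|(g, h_R^i)|. \tag{26}
\]

We fix functions $f$ and $g$ and define $\mathbb{S}_n$ as an operator with the following quadratic form:
\[
(\mathbb{S}_n u, v) = \sum_{\substack{\ell(Q) = \delta^{-n}\ell(R)\\ R \text{ is good},\ R \subset Q}} \pm\Bigl(\frac{\mu(R)}{\mu(Q)}\Bigr)^{1/2}(u, h_Q^j)(v, h_R^i),
\]
where $\pm$ is chosen so that $|(f, h_Q^j)|\,|(g, h_R^i)| = \pm(f, h_Q^j)(g, h_R^i)$. Then clearly $\mathbb{S}_n$ is a dyadic shift of complexity $n$, and so, see Section 8,
\[
|(\mathbb{S}_n f, g)| \le C n^a\,[w]_2\,\|f\|_w\|g\|_{w^{-1}}.
\]

Therefore,
\[
|\Sigma_{in}| \le \sum_n C n^a\,\delta^{\varepsilon n/2}\,[w]_2\,\|f\|_w\|g\|_{w^{-1}} \le C\,[w]_2\,\|f\|_w\|g\|_{w^{-1}}.
\]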
6.5. Estimates for $\Sigma_{out}$. We use the following lemma from [6].

Lemma 6.7. Let $T$ be as before, $\ell(R) \le \ell(Q)$ and $R \cap Q = \emptyset$. Then the following holds

where $D(Q,R) = \ell(Q) + \ell(R) + \operatorname{dist}(Q,R)$.

Remark 9. We should clarify one thing here. If $T$ were a Calderon-Zygmund operator, this estimate would be standard, see [11], [12] or, for metric spaces, [6]. We, however, subtracted from $T$ two operators: the paraproduct and the adjoint of a paraproduct. However, an easy argument (see [8]) shows that if $R \cap Q = \emptyset$, then $(\tilde T h_Q^j, h_R^i) = (Th_Q^j, h_R^i)$ (for the definition of $\tilde T$ see Lemma 6.3 and thereon).

Suppose now that $D(Q,R)\sim\delta^{-s}\ell(Q)$. We ask the question: what is the probability

$$\mathbb{P}\big(R\subset Q^{(s+s_0+10)}\ \big|\ Q,R\in\mathcal{D}_\omega\big),$$

where $s_0$ is a sufficiently big number. We use Lemma 4.2. Suppose that $R\cap Q^{(s+s_0+10)}=\emptyset$. Suppose also $R=R_x$ (so $x$ is the "center" of $R$). Then

$$(27)\qquad\operatorname{dist}(x,Q^{(s+s_0+10)})\le\operatorname{dist}(x,Q)\le\operatorname{dist}(Q,R)\le C\delta^{-s}\ell(Q)=c\,\delta^{-s}\delta^{s+s_0+10}\ell(Q^{(s+s_0+10)})=c\,\delta^{s_0+10}\ell(Q^{(s+s_0+10)}).$$

So $x\in\delta_{Q^{(s+s_0+10)}}(\delta^{s_0+10})$, and the probability of this is estimated by $c\,\delta^{\eta(s_0+10)}<\frac12$ for sufficiently big $s_0$ (we remind that $\eta=\log_\delta(1-a)$). Therefore,

$$\mathbb{P}\big(R\subset Q^{(s+s_0+10)}\ \big|\ Q,R\in\mathcal{D}_\omega\big)\ge\frac12.$$
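Since $\eta=\log_\delta(1-a)$, one has $\delta^{\eta k}=(1-a)^k$, so the required size of $s_0$ can be read off explicitly. Here is a small sketch, assuming that the probability of landing in the boundary layer is bounded by an expression of the form $c\,\delta^{\eta(s_0+10)}$. The function name `minimal_s0` and the sample values of `c` and `a` are hypothetical.

```python
def minimal_s0(c: float, a: float, shift: int = 10) -> int:
    """Smallest integer s0 >= 0 with c * (1 - a)**(s0 + shift) < 1/2."""
    if not 0.0 < a < 1.0 or c <= 0.0:
        raise ValueError("need 0 < a < 1 and c > 0")
    s0 = 0
    # (1 - a)**k is the same number as delta**(eta * k) when eta = log_delta(1 - a).
    while c * (1.0 - a) ** (s0 + shift) >= 0.5:
        s0 += 1
    return s0

# Sample values (assumptions): c = 1000, a = 1/2.
example_s0 = minimal_s0(1000.0, 0.5)
```

The loop terminates because $(1-a)^k\to0$, which is exactly the sense in which "sufficiently big $s_0$" exists.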



So 



(28) \L ol<t \ <2E£ £ 1(^,4)11^^)11^^)11^ goodl^^iB) < 

t,s e(Q)=8->£(R), 
D(QJi)~8-%Q), 
#ng=0 

D(Q,R)~8-°t(Q), 
RC\Q=% 
R,QcQ s+s o+ [0 



m *(e)=s-'<?(*) 

sng=0, 

D(e,R)~5-^(G), 
sng=0, 

We now define S„ as we did before: 

D(Q,R)-5-^(Q), 
We need to estimate the coefficient. We write

$$(29)\qquad\lambda(z,D(Q,R))\sim\lambda(z,\delta^{-s}\ell(Q))\sim\lambda(z,\delta^{-s-s_0-10}\ell(Q))\gtrsim\mu(Q^{(s+s_0+10)}),$$

and therefore

$$\frac{1}{\sup_z\lambda(z,D(Q,R))}\le\frac{C}{\mu(Q^{(s+s_0+10)})}.$$
We notice that $C$ does not depend on $s$, since we used the doubling property of $\lambda$ only for the transition from $\delta^{-s}\ell(Q)$ to $\delta^{-s-s_0-10}\ell(Q)$.

We conclude that $S_n$ is a dyadic shift of complexity at most $C(s+t)$. Therefore, see Section 8,

$$|\Sigma_{out}|\le2C\,\mathbb{E}\sum_{t,s}\delta^{t\varepsilon/2}\delta^{s\varepsilon}(s+t)^a[w]_2\|f\|_w\|g\|_{w^{-1}}\le C[w]_2\|f\|_w\|g\|_{w^{-1}},$$

and our proof is completed.

7. PARAPRODUCTS AND BELLMAN FUNCTION 
Now we will prove Lemma 6.3.

We remind that the quadratic form of our paraproduct $\pi$ is the following:

$$\pi_{T\chi_X}(f,g):=\sum_R\sum_i\langle f\rangle_{\mu,R}\,(T\chi_X,h_R^i)\,(g,h_R^i).$$

The operator $T$ is bounded in $L^2(\mu)$ and $\mu$ is doubling. Therefore, it is well known that the coefficients $b_R:=b_R^i:=(T\chi_X,h_R^i)$ satisfy the Carleson condition for any of our lattices of Christ's dyadic cubes:

$$(30)\qquad\forall Q\in\mathcal{D}:\quad\sum_{R\in\mathcal{D},\,R\subset Q}|b_R|^2\le B\,\mu(Q).$$

The best constant $B$ here is called the Carleson constant and it is denoted by $\|b\|_C$. It is known that for our $b_R:=(T\chi_X,h_R^i)$ the Carleson constant is bounded by $B_T:=C\,\|T\|^2_{L^2(\mu)\to L^2(\mu)}$.

If we were on the line with Lebesgue measure $\mu$ and $w$ were a usual weight in $A_2$, then the sum would obey the estimate of O. Beznosova [1]:

$$(31)\qquad|\pi_{T\chi_X}(f,g)|\le C\sqrt{B_T}\,[w]_{A_2}\|f\|_w\|g\|_{w^{-1}}.$$

But the same is true in our situation. To prove that, one should analyze the proof in [1] and see that it always used the conditions on $w$ and $b$ separately. They were always split by the Cauchy-Schwarz inequality. The only inequality where $w$ and $b$ meet was of the type: let $Q$ be a Christ's cube of a certain lattice, then

$$(32)\qquad\sum_{R\subset Q,\,R\in\mathcal{D}}\langle w\rangle_{\mu,R}\,b_R^2\le[w]_{A_\infty}\|b\|_C\int_Qw\,d\mu,$$

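where

$$[w]_{A_\infty}=\sup_R\frac{\langle w\rangle_{\mu,R}}{\exp(\langle\log w\rangle_{\mu,R})}.$$

Let us explain the last inequality. We write

$$\langle w\rangle_{\mu,R}\le[w]_{A_\infty}\exp(\langle\log w\rangle_{\mu,R})=[w]_{A_\infty}\exp(\langle\log w^{1/2}\rangle_{\mu,R})^2\le[w]_{A_\infty}\langle w^{1/2}\rangle_{\mu,R}^2\le[w]_{A_\infty}\inf_{x\in R}M(w^{1/2}\chi_R)(x)^2.$$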
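The middle step of this chain is Jensen's inequality: the exponential of the average of $\log w^{1/2}$ is at most the average of $w^{1/2}$. The sketch below checks the resulting ordering for a weight sampled at finitely many points with equal masses. This discrete model and the name `jensen_chain` are assumptions made purely for illustration.

```python
import math

def jensen_chain(values):
    """Return (exp<log w>, <w^(1/2)>^2, <w>) for positive samples of w with equal masses."""
    n = len(values)
    geometric = math.exp(sum(math.log(v) for v in values) / n)
    half_power = (sum(math.sqrt(v) for v in values) / n) ** 2
    arithmetic = sum(values) / n
    return geometric, half_power, arithmetic

# Hypothetical sample values of a weight w on four atoms of equal mass.
geo, half, arith = jensen_chain([0.1, 1.0, 4.0, 25.0])
```

For any positive samples one gets `geo <= half <= arith`, so the ratio `arith / geo` (the local $A_\infty$ quantity in this toy model) is at least one.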



Finally, we notice that $\{b_R\}$ is a Carleson sequence, and finish our explanation with the following well known theorem.

Theorem 7.1. Suppose $\{\alpha_K\}$ is a Carleson sequence. Then for any positive function $F$ the following inequality holds:

$$\sum_K\alpha_K\inf_{x\in K}F(x)\le\int F(x)\,d\mu(x).$$
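This embedding can be checked numerically in the simplest model case. The sketch below works on the dyadic intervals of $[0,1)$ with Lebesgue measure (an assumption; the text deals with Christ's cubes in a metric space), computes the Carleson constant $B$ of a random sequence explicitly, and compares $\sum_K\alpha_K\inf_KF$ with $B\int F$. All function names are hypothetical.

```python
import random

def dyadic_intervals(depth):
    """All pairs (k, j) encoding [j / 2^k, (j + 1) / 2^k) for 0 <= k <= depth."""
    return [(k, j) for k in range(depth + 1) for j in range(2 ** k)]

def contains(outer, inner):
    """True if the dyadic interval `inner` is a subset of `outer`."""
    (k, j), (k2, j2) = outer, inner
    return k2 >= k and (j2 >> (k2 - k)) == j

def carleson_constant(alpha):
    """Smallest B with sum_{J subset I} alpha_J <= B * |I| for every I."""
    best = 0.0
    for (k, j) in alpha:
        total = sum(a for J, a in alpha.items() if contains((k, j), J))
        best = max(best, total * 2 ** k)  # dividing by |I| = 2^(-k)
    return best

def embedding_sides(alpha, F, depth):
    """Return (sum_K alpha_K * inf_K F, B * integral of F), F constant on finest cells."""
    n = 2 ** depth
    lhs = 0.0
    for (k, j), a in alpha.items():
        width = n >> k  # number of finest cells inside the interval (k, j)
        lhs += a * min(F[j * width:(j + 1) * width])
    integral = sum(F) / n
    return lhs, carleson_constant(alpha) * integral

random.seed(0)
depth = 5
alpha = {I: random.random() * 2.0 ** (-I[0]) for I in dyadic_intervals(depth)}
F = [random.uniform(0.1, 1.0) for _ in range(2 ** depth)]
lhs, rhs = embedding_sides(alpha, F, depth)
```

In this model the left side never exceeds the right side, which is the statement of the theorem with the Carleson constant written out explicitly instead of being normalized.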

In all other estimates in [1] the sums with $\Delta_Qw$ (see the definition before Lemma 3.2 of [1]) and the sums with $b$ are always estimated separately. The sums where the terms contain the product of $\Delta_Qw$ and $b_Q$ were never estimated by the Bellman technique: they were split first. Then (31) follows in our metric situation as well.

8. Weighted estimates for dyadic shifts via Bellman function 

This section is here just for the sake of completeness. In fact, it just repeats the article of Nazarov-Volberg [16]. In this section we prove the following theorem.

Theorem 8.1. Let $S_{m,n}$ be a dyadic shift of complexity $(n,m)$. Then

$$\|S_{m,n}\|_{L^2(w\,d\mu)}\le C(m+n+1)^a[w]_{2,\mu}.$$

Remark 10. We notice that the best known $a$ is equal to one. It can be obtained using the technique from Q or from [19]. However, for the application we made in the previous sections, namely, the linear $A_2$ bound for an arbitrary Calderón-Zygmund operator on a geometrically doubling metric space, the actual value of $a$ is not important.

We now give formal definitions. Let $h_Q^j$ be Haar functions as before, normalized in $L^2(\mu)$. We also denote by $g(Q)$ the generation of a "dyadic cube" $Q$. Then by $S_{m,n}$ we denote an operator

$$\sum_L\int_La_L(x,y)f(y)\,d\mu(y),$$

where

$$a_L(x,y)=\sum_{\substack{I\subset L,\ J\subset L\\ g(I)=g(L)+m,\ g(J)=g(L)+n}}c_{L,I,J}\,h_J^j(x)\,h_I^i(y).$$

We denote $\sigma=w^{-1}$. We begin with the following lemma.

Lemma 8.2.

$$h_I^j=\alpha_I^jh_I^{w,j}+\beta_I^j\chi_I,$$

where

1) $|\alpha_I^j|\le\sqrt{\langle w\rangle_{\mu,I}}$,

2) $|\beta_I^j|\le\dfrac{|(h_I^j,w)_\mu|}{w(I)}$, where $w(I):=\int_Iw\,d\mu$,

3) $\{h_I^{w,j}\}_I$ is supported on $I$, orthogonal to constants in $L^2(w\,d\mu)$,

4) $h_I^{w,j}$ assumes on each son $s(I)$ a constant value,

5) $\|h_I^{w,j}\|_{L^2(w\,d\mu)}=1$.
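Definition. Let

$$\Delta_Iw:=\sum_{\text{sons of }I}|\langle w\rangle_{\mu,s(I)}-\langle w\rangle_{\mu,I}|.$$

It is easy to see that the doubling property of the measure $\mu$ implies

$$(33)\qquad|(h_I^j,w)_\mu|\le C(\Delta_Iw)\,\mu(I)^{1/2}.$$

Therefore, since $w(I)=\langle w\rangle_{\mu,I}\,\mu(I)$, the property 2) above can be rewritten as

2') $|\beta_I^j|\le C\,\dfrac{|\Delta_Iw|}{\langle w\rangle_{\mu,I}\,\mu(I)^{1/2}}$.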
Fix $\phi\in L^2(w\,d\mu)$, $\psi\in L^2(\sigma\,d\mu)$. We need to prove

$$(34)\qquad|(S_{m,n}\phi w,\psi\sigma)|\le C(n+m+1)^a[w]_{A_2}\|\phi\|_w\|\psi\|_\sigma.$$

We estimate $(S_{m,n}\phi w,\psi\sigma)$ as



L I J 



L I J 

EE ^/(M^t^t^V^/ W (cr) M/ V/| + 
A/a 



EEl c w(^)/x,/ 7 ^(0H',/ I 7) M J(>v) Mj /^7| + 

EEk w (0w) M , / (^a) A1 , / -^ 7 ^-V/V7| =:I + II + III + IV. 

We can notice that because $|c_{L,I,J}|\le\frac{\sqrt{\mu(I)}\sqrt{\mu(J)}}{\mu(L)}$, each sum inside $L$ can be estimated by a perfect product of $S$ and $R$ terms, where

7S!.. \ w )hj \M L ) 

5 L (0w):= £ (^^) M J(^^ 
7cl... yV-KH 
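and the corresponding terms for $\psi\sigma$. So we have

$$I\le\sum_LS_L(\phi w)S_L(\psi\sigma),\qquad II\le\sum_LS_L(\phi w)R_L(\psi\sigma),$$

$$III\le\sum_LR_L(\phi w)S_L(\psi\sigma),\qquad IV\le\sum_LR_L(\phi w)R_L(\psi\sigma).$$

Now

$$(35)\qquad S_L(\phi w)\le\sqrt{\sum_{I\subset L\ldots}|(\phi w,h_I^w)_\mu|^2}\,\sqrt{\langle w\rangle_{\mu,L}},\qquad S_L(\psi\sigma)\le\sqrt{\sum_{J\subset L\ldots}|(\psi\sigma,h_J^\sigma)_\mu|^2}\,\sqrt{\langle\sigma\rangle_{\mu,L}}.$$

Therefore,

$$(36)\qquad I\le C[w]_{A_2}^{1/2}\|\phi\|_w\|\psi\|_\sigma.$$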

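Terms $II$ and $III$ are symmetric, so consider $III$. Using the Bellman function $(xy)^\alpha$ one can prove now

Lemma 8.3. The sequence

$$\left\{\langle w\rangle_{\mu,I}^\alpha\langle\sigma\rangle_{\mu,I}^\alpha\left(\frac{|\Delta_Iw|^2}{\langle w\rangle_{\mu,I}^2}+\frac{|\Delta_I\sigma|^2}{\langle\sigma\rangle_{\mu,I}^2}\right)\mu(I)\right\}_{I\in\mathcal{D}}$$

forms a Carleson measure with Carleson constant at most $c_\alpha Q^\alpha$, where $Q:=[w]_{A_2}$, for any $\alpha\in(0,1/2)$.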

Proof. We need a very simple

Sublemma. Let $Q>1$, $0<\alpha<\frac12$. In the domain $\Omega_Q:=\{(x,y):x>0,\ y>0,\ 1\le xy\le Q\}$ the function $B_Q(x,y):=x^\alpha y^\alpha$ satisfies the following estimate of its Hessian matrix (of its second differential form, actually):

$$-d^2B_Q(x,y)\ge\alpha(1-2\alpha)\,x^\alpha y^\alpha\left(\frac{(dx)^2}{x^2}+\frac{(dy)^2}{y^2}\right).$$

The form $-d^2B_Q(x,y)\ge0$ everywhere in $x>0$, $y>0$. Also obviously $0\le B_Q(x,y)\le Q^\alpha$ in $\Omega_Q$.
Proof. Direct calculation. □
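The direct calculation can be sanity-checked numerically. For $B(x,y)=x^\alpha y^\alpha$ one has $B_{xx}=\alpha(\alpha-1)x^{\alpha-2}y^\alpha$, $B_{yy}=\alpha(\alpha-1)x^\alpha y^{\alpha-2}$ and $B_{xy}=\alpha^2x^{\alpha-1}y^{\alpha-1}$, so $-d^2B$ minus the claimed lower bound equals $\alpha^2x^\alpha y^\alpha\left(\frac{dx}{x}-\frac{dy}{y}\right)^2\ge0$. The sketch below tests the inequality at random points; the function names and sampling ranges are hypothetical choices.

```python
import random

def neg_second_form(x, y, alpha, dx, dy):
    """Value of -d^2 B(x, y) on the direction (dx, dy) for B = x**alpha * y**alpha."""
    bxx = alpha * (alpha - 1.0) * x ** (alpha - 2.0) * y ** alpha
    byy = alpha * (alpha - 1.0) * x ** alpha * y ** (alpha - 2.0)
    bxy = alpha * alpha * x ** (alpha - 1.0) * y ** (alpha - 1.0)
    return -(bxx * dx * dx + 2.0 * bxy * dx * dy + byy * dy * dy)

def lower_bound(x, y, alpha, dx, dy):
    """Claimed bound alpha * (1 - 2 alpha) * x^alpha * y^alpha * ((dx/x)^2 + (dy/y)^2)."""
    return alpha * (1.0 - 2.0 * alpha) * x ** alpha * y ** alpha * ((dx / x) ** 2 + (dy / y) ** 2)

def check_sublemma(trials=1000, seed=0):
    """Test the inequality at random points with 0 < alpha < 1/2."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0)
        alpha = rng.uniform(0.01, 0.49)
        dx, dy = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        lhs = neg_second_form(x, y, alpha, dx, dy)
        rhs = lower_bound(x, y, alpha, dx, dy)
        if lhs < rhs - 1e-9 * (1.0 + abs(lhs)):  # small tolerance for rounding
            return False
    return True
```

The restriction $\alpha<\frac12$ is what keeps the factor $\alpha(1-2\alpha)$ positive.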

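Fix now a Christ's cube $I$ and let $s_i(I)$, $i=1,\dots,M$, be all its sons. Let $a=(\langle w\rangle_{\mu,I},\langle\sigma\rangle_{\mu,I})$, $b_i=(\langle w\rangle_{\mu,s_i(I)},\langle\sigma\rangle_{\mu,s_i(I)})$, $i=1,\dots,M$, be points, obviously in $\Omega_Q$, where $Q$ temporarily means $[w]_{A_2}$. Consider $c_i(t)=a(1-t)+b_it$, $0\le t\le1$, and $q_i(t):=B_Q(c_i(t))$. We want to use Taylor's formula

$$(37)\qquad q_i(0)-q_i(1)=-q_i'(0)-\int_0^1dx\int_0^xq_i''(t)\,dt.$$

Notice two things: the Sublemma shows that $-q_i''(t)\ge0$ always. Moreover, it shows that if $t\in[0,1/2]$, then the following quantitative estimate holds:

$$(38)\qquad-q_i''(t)\ge c\,(\langle w\rangle_{\mu,I}\langle\sigma\rangle_{\mu,I})^\alpha\left(\frac{(\langle w\rangle_{\mu,s_i(I)}-\langle w\rangle_{\mu,I})^2}{\langle w\rangle_{\mu,I}^2}+\frac{(\langle\sigma\rangle_{\mu,s_i(I)}-\langle\sigma\rangle_{\mu,I})^2}{\langle\sigma\rangle_{\mu,I}^2}\right).$$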

This requires a small explanation. If we are on the segment $[a,b_i]$, then the first coordinate of such a point cannot be larger than $C\langle w\rangle_{\mu,I}$, where $C$ depends only on the doubling of $\mu$ (not $w$). This is obvious. The same is true for the second coordinate with the obvious change of $w$ to $\sigma$. But there is no such type of estimate from below on this segment: the first coordinate cannot be smaller than $k\langle w\rangle_{\mu,I}$, but $k$ may (and will) depend on the doubling of $w$ (so ultimately on its $[w]_{A_2}$ norm). In fact, at the "right" endpoint of $[a,b_i]$ the first coordinate is $\langle w\rangle_{\mu,s_i(I)}\le\int_Iw\,d\mu/\mu(s_i(I))\le C\int_Iw\,d\mu/\mu(I)=C\langle w\rangle_{\mu,I}$, with $C$ only depending on the doubling of $\mu$. But the estimate from below will involve the doubling of $w$, which we must avoid. But if $t\in[0,1/2]$, and we are on the "left half" of the interval $[a,b_i]$, then obviously the first coordinate is $\ge\frac12\langle w\rangle_{\mu,I}$ and the second coordinate is $\ge\frac12\langle\sigma\rangle_{\mu,I}$.

We do not need to integrate $-q_i''(t)$ for all $t\in[0,1]$ in (37). We can use only the integration over $[0,1/2]$, noticing that $-q_i''(t)\ge0$ otherwise. Then the chain rule

$$q_i''(t)=(B_Q(c_i(t)))''=(d^2B_Q(c_i(t))(b_i-a),\,b_i-a)$$

immediately gives us (38) with constant $c$ depending on the doubling of $\mu$ but independent of the doubling of $w$.

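The next step is to add all (37), with convex coefficients $\frac{\mu(s_i(I))}{\mu(I)}$, and to notice that

$$\sum_{i=1}^M\frac{\mu(s_i(I))}{\mu(I)}\,q_i'(0)=\nabla B_Q(a)\cdot\sum_{i=1}^M(b_i-a)\,\frac{\mu(s_i(I))}{\mu(I)}=0,$$

because by definition $a=\sum_{i=1}^M\frac{\mu(s_i(I))}{\mu(I)}\,b_i$.

Notice that the addition of all (37), with convex coefficients $\frac{\mu(s_i(I))}{\mu(I)}$, gives us now (we take into account (38) and the positivity of $-q_i''(t)$)

$$B_Q(a)-\sum_{i=1}^M\frac{\mu(s_i(I))}{\mu(I)}B_Q(b_i)\ge c\,c_1(\langle w\rangle_{\mu,I}\langle\sigma\rangle_{\mu,I})^\alpha\sum_{i=1}^M\left(\frac{(\langle w\rangle_{\mu,s_i(I)}-\langle w\rangle_{\mu,I})^2}{\langle w\rangle_{\mu,I}^2}+\frac{(\langle\sigma\rangle_{\mu,s_i(I)}-\langle\sigma\rangle_{\mu,I})^2}{\langle\sigma\rangle_{\mu,I}^2}\right).$$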



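We used here the doubling of $\mu$ again, by noticing that $\frac{\mu(s_i(I))}{\mu(I)}\ge c_1$ (recall that $s_i(I)$ and $I$ are almost balls of comparable radii). We rewrite the previous inequality using our definition of $\Delta_Iw$, $\Delta_I\sigma$ listed above as follows:

$$\mu(I)B_Q(a)-\sum_{i=1}^M\mu(s_i(I))B_Q(b_i)\ge c\,c_1(\langle w\rangle_{\mu,I}\langle\sigma\rangle_{\mu,I})^\alpha\left(\frac{(\Delta_Iw)^2}{\langle w\rangle_{\mu,I}^2}+\frac{(\Delta_I\sigma)^2}{\langle\sigma\rangle_{\mu,I}^2}\right)\mu(I).$$

Notice that $B_Q(a)=\langle w\rangle_{\mu,I}^\alpha\langle\sigma\rangle_{\mu,I}^\alpha$. Now we iterate the above inequality and get for any of Christ's dyadic $I$'s:

$$\sum_{J\subset I}(\langle w\rangle_{\mu,J}\langle\sigma\rangle_{\mu,J})^\alpha\left(\frac{(\Delta_Jw)^2}{\langle w\rangle_{\mu,J}^2}+\frac{(\Delta_J\sigma)^2}{\langle\sigma\rangle_{\mu,J}^2}\right)\mu(J)\le C\,Q^\alpha\mu(I).$$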
This is exactly the Carleson property of the measure indicated in our Lemma 8.3, with Carleson constant $CQ^\alpha$. The proof showed that $C$ depended only on $\alpha\in(0,1/2)$ and on the doubling constant of the measure $\mu$.

□ 

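Now, using this lemma, we start to estimate our $S_L$'s and $R_L$'s. For $S_L(\psi\sigma)$ we already had estimate (35).

To estimate $R_L(\phi w)$ let us denote by $\mathcal{P}_L$ the maximal stopping intervals $K\in\mathcal{D}$, $K\subset L$, where the stopping criteria are 1) either $\frac{|\Delta_Kw|}{\langle w\rangle_{\mu,K}}\ge\frac{1}{m+n+1}$ or $\frac{|\Delta_K\sigma|}{\langle\sigma\rangle_{\mu,K}}\ge\frac{1}{m+n+1}$, or 2) $g(K)=g(L)+m$.

Lemma 8.4. If $K$ is any stopping interval then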
(39) 

I K<MM/l^^<2^+n+l)<l^^ 
Proof. If we stop by the first criterion, then 



<2(m + «+l)(|0|w)^^^V%M,l /2 (a)^ /2 . 



Now replacing $\langle w\rangle_{\mu,K}^{-\alpha/2}\langle\sigma\rangle_{\mu,K}^{-\alpha/2}$ by $\langle w\rangle_{\mu,L}^{-\alpha/2}\langle\sigma\rangle_{\mu,L}^{-\alpha/2}$ does not grow the estimate by more than $e^\alpha$, as all pairs of son/father intervals larger than $K$ and smaller than $L$ will have their averages compared by a constant at most $1\pm\frac{1}{m+n+1}$. And there are at most $m$ such intervals between $K$ and $L$.

If we stop by the second criterion, then $K$ is one of the $I$'s, $g(I)=g(L)+m$, and



\ - |A/w| H(I) .. . . M(*0 l A *H ^/ui » /=-, v-a/2. v -a/2 



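Now we replace $\langle w\rangle_{\mu,I}^{-\alpha/2}\langle\sigma\rangle_{\mu,I}^{-\alpha/2}$ by $\langle w\rangle_{\mu,L}^{-\alpha/2}\langle\sigma\rangle_{\mu,L}^{-\alpha/2}$ as before.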
Now 



□ 



Rl(<Pw) <C(m + n+l)(w)^ L /2 (a)/ L /2 £ ^2§^ 

1 /2 

<C(m + n + l)( W );f(a);f( £ (I0k) 2 ,^) (^/ 2 , 

where 

Notice that the sequence {^L]ie9 form a Carleson sequence (measure) with constant at most 
C{m+\)Q a . 

Now we make a trick! We will estimate the right hand side as 

R L ^ W )<C{m + n + l){ W )-f{a)-f( £ (|f| w )J ^V'^)^, 

\Ke&> L AH^ / 
where $p=2-\frac{1}{m+n+1}$. In fact,



KcL, K is maximal H-V- 1 ) / Ke3» L \t*\^J 



I 



H(K) ^ 2(m+ " +l) 



But if if < j < m, then (C J ) '»+"+' < C, and therefore in the formula above ^ J 

C , and C depends only on th 
using Cauchy inequality, one gets 



C^y, and C depends only on the doubling constant of p. So the trick is justified. Therefore, 



1/2 



R L ^w)<C(m + n + l)(w)-^ 2 (a)-^ 2 ( £ (IC^W^^Y Vl) 1/2 ■ 

We can replace all $\langle w\rangle_{\mu,K}^{-1}$ by $\langle w\rangle_{\mu,L}^{-1}$, paying the price of a constant. This is again because all intervals larger than $K$ and smaller than $L$ will have their averages compared by a constant at most $1\pm\frac{1}{m+n+1}$. And there are at most $m$ such intervals between $K$ and $L$. Finally,

(40) R^w)<C(m + n+l)(w)^ L /2 (a)^ L /2 ( £ ^\p w ) ^ K ^\'\ w )^} {\ 

We need the standard notation: if $\nu$ is an arbitrary positive measure we denote

$$M_\nu f(x):=\sup_{r>0}\frac{1}{\nu(B(x,r))}\int_{B(x,r)}|f(y)|\,d\nu(y).$$

In particular $M_w$ will stand for this maximal function with $d\nu=w(x)\,d\mu$.

From (l40l we get 

(41) Rd<pw)<C(m + n+l)(w)l- L a/2 (o)-^ /2 

Now 
(42) 



(<j)15 v ' 

(43) /? L (y«7)/?zXM<^ + «+l)M^^ 

Now we use the Carleson property of We need a simple folklore Lemma. 

Lemma 8.5. Let $\{\alpha_L\}_{L\in\mathcal{D}}$ define a Carleson measure with intensity $B$ related to the dyadic lattice $\mathcal{D}$ on the metric space $X$. Let $F$ be a positive function on $X$. Then

$$(44)\qquad\sum_L\Big(\inf_LF\Big)\alpha_L\le2B\int_XF\,d\mu.$$

^ inf L F f F 

(45) £-^a L <Cfi / -d^. 

Now use (l42i Then the estimate of /// < Ll^l(V A(J )^l(0w') will be reduced to estimating 



+ »+i)g'-^(£ "^y\ ) <( m+ » + i) 2 G(/ t (M„. ( i*ro) 2 "v*) 

< (j^) l/p (m + n+l) 2 Q^j^ 2 wdt?j ' <(m + n+l) 3 Q^j^ 2 wdp^ 
Here we used (45) and the usual estimates of the maximal function in $L^q(\mu)$ when $q\approx1$. Of course for $II$ we use the symmetric reasoning.



Now $IV$: we use (43) first.

L L L L 

<C(m + n+lfQ f {M w m P )Y IP {M a {\ W \P)) y IP W l l 2 o l l 2 dpL 

JR 



<C(m + n+l) 4 Q( [ (fwdfl) ( [ ypadfl 
Here we used (44) and the usual estimates of the maximal function $M_\mu$ in $L^{2/p}(\mu)$ when $p\approx2$, $p<2$.